## Implementation of Vision for Robots on FPGA

## Abstract

Computer vision[3] is an important capability for robots in many real-time applications, since it enables object detection, recognition and tracking[5]. The field of computer vision provides a number of algorithms for object detection and recognition. The Scale Invariant Feature Transform (SIFT)[1][4] is an efficient feature detection algorithm. The proposed work concentrates on a hardware implementation of the SIFT algorithm on a Field Programmable Gate Array (FPGA)[2] using Xilinx software.

Digital image processing refers to the processing of digital images by means of a digital computer. Before the 1960s, a computer was as big as a room. The adoption of transistors and the later introduction of Very Large Scale Integration (VLSI) technology drastically changed the size and speed of computers. The individual components of a computer, such as the control unit, the arithmetic and logic unit and the memory, can now be integrated on a single chip, and a computer can now be as small as a palm. This opens up good scope for using such small computers in robots to perform digital image processing in a real-time environment.

Feature-based image matching is a key task in many computer vision applications, such as object recognition. SIFT, proposed by David Lowe, is one of the best feature recognition algorithms. The interesting points of an object in an image are extracted to form a feature description. The feature description extracted from a training image is then used to identify the object in a test image. The features extracted from the training image must remain detectable under changes in image scale, noise and illumination. Such points usually lie at object edges, i.e. in high-contrast regions. The SIFT feature descriptor is invariant to uniform scaling and orientation, and partially invariant to illumination changes.
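The matching step described above can be illustrated with a minimal Python sketch. It is a software illustration only, not the proposed hardware design. It assumes that descriptors are plain lists of numbers and that matching uses nearest-neighbour search with the distance-ratio test described by Lowe[4]; the function names and the toy 4-element descriptors are hypothetical (real SIFT descriptors have 128 elements).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(train, test, ratio=0.8):
    """Return (train_index, test_index) pairs that pass the ratio test.

    A training descriptor is matched to its nearest test descriptor only if
    that distance is clearly smaller than the distance to the second nearest.
    """
    matches = []
    for i, descriptor in enumerate(train):
        distances = sorted((euclidean(descriptor, t), j) for j, t in enumerate(test))
        if len(distances) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_index), (second, _) = distances[0], distances[1]
        if best < ratio * second:
            matches.append((i, best_index))
    return matches


# Toy descriptors (hypothetical values, for illustration only).
train_descriptors = [[1, 0, 0, 0], [0, 1, 0, 0]]
test_descriptors = [[0.9, 0.1, 0, 0], [0, 0, 1, 0], [0, 0.95, 0.05, 0]]
matches = match_descriptors(train_descriptors, test_descriptors)
```

The ratio test rejects ambiguous matches: a training descriptor that lies equally close to two test descriptors produces no match at all.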

The first stage of SIFT is scale-space peak selection, in which potential interest points are identified by scanning the image over location and scale. The second stage localizes the candidate keypoints and eliminates those found to be unstable. The third stage identifies the dominant orientations for each keypoint. The final stage builds a local image descriptor for each keypoint based on the image gradients in its local neighbourhood.
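The first stage can be sketched in software as follows. This is a minimal pure-Python model under stated assumptions, not the proposed FPGA architecture: it assumes a single octave, a difference-of-Gaussians (DoG) scale space as in Lowe[4], and a comparison of each sample against its 26 neighbours in location and scale. The function names, the choice of blur levels and the contrast threshold are illustrative.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur; pixels outside the image repeat the border value."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(v, limit):
        return min(max(v, 0), limit - 1)

    rows = [[sum(k * image[y][clamp(x + i - radius, width)] for i, k in enumerate(kernel))
             for x in range(width)] for y in range(height)]
    return [[sum(k * rows[clamp(y + i - radius, height)][x] for i, k in enumerate(kernel))
             for x in range(width)] for y in range(height)]


def difference_of_gaussians(image, sigmas):
    """Subtract each blurred image from the next, more strongly blurred one."""
    blurred = [blur(image, s) for s in sigmas]
    return [[[b[y][x] - a[y][x] for x in range(len(image[0]))]
             for y in range(len(image))]
            for a, b in zip(blurred, blurred[1:])]


def scale_space_extrema(dog, threshold=0.01):
    """Return (scale_index, y, x) for samples that are strict extrema
    among their 26 neighbours in location and scale."""
    keypoints = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                value = dog[s][y][x]
                if abs(value) < threshold:
                    continue  # skip low-contrast samples
                neighbours = [dog[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if value > max(neighbours) or value < min(neighbours):
                    keypoints.append((s, y, x))
    return keypoints
```

Each output pixel depends only on a small fixed window of inputs, which is the property that makes this stage a natural candidate for a pipelined hardware implementation.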

A robot operating in real time needs to capture a picture using a digital camera. The captured image is then processed by an onboard computer, which identifies the features of the image by comparing it with the images stored in a database. The robot then takes the necessary action based on the output of the image processing section. The SIFT algorithm is used to detect features in the image. Continuing improvements in VLSI technology help in building more efficient hardware processing units for everyday computation. In our research, we will design and implement an efficient hardware architecture on an FPGA to run the SIFT algorithm. The same concept can be extended to modified SIFT algorithms[7][8] and to video processing[6]. The hardware implementation will be coded in Very High Speed Integrated Circuit Hardware Description Language (VHDL) in the Xilinx Integrated Software Environment (ISE).

## Implementation



The figure above shows the schematic diagram of the proposed embedded vision module, generated using Vivado 2019.2. The design captures video using a low-cost OV7670 camera and displays the output on a VGA display unit. The prototype consists of an OV7670 camera, an Artix-7 based FPGA and a VGA monitor. The design contains the following modules written in VHDL: the OV7670 Camera Controller module, the OV7670 Image Capture module, the Frame Buffer module and the VGA output display module. The OV7670 Camera Controller uses an I2C module to configure the internal registers of the camera. Once these registers are configured, the camera module captures the input images and forwards them to the Frame Buffer module.
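The data path from the capture module through the frame buffer to the VGA output can be modelled in software as in the following Python sketch. It is a behavioural illustration only and does not reproduce the VHDL modules. Several details are assumptions that the text does not state: the camera is taken to send 16-bit RGB565 pixels as two bytes with the high byte first, VSYNC is treated as active high, and the frame buffer is modelled as a flat memory addressed by `y * width + x`. All class and function names are hypothetical.

```python
class FrameBuffer:
    """Software stand-in for a dual-port memory: one write port, one read port."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.memory = [0] * (width * height)

    def write(self, address, pixel):
        if 0 <= address < len(self.memory):  # ignore writes past the end of the frame
            self.memory[address] = pixel

    def read(self, address):
        return self.memory[address]


class CaptureModel:
    """Assembles two successive camera bytes into one 16-bit pixel (RGB565 assumed)."""

    def __init__(self, frame_buffer):
        self.frame_buffer = frame_buffer
        self.address = 0
        self.high_byte = None

    def clock(self, vsync, href, data):
        """Model one pixel-clock edge with the sampled VSYNC, HREF and data bus values."""
        if vsync:
            # Assumed active-high VSYNC: a new frame starts, rewind the write pointer.
            self.address = 0
            self.high_byte = None
        elif href:
            if self.high_byte is None:
                self.high_byte = data & 0xFF  # first byte of the pixel
            else:
                pixel = (self.high_byte << 8) | (data & 0xFF)
                self.frame_buffer.write(self.address, pixel)
                self.address += 1
                self.high_byte = None
        else:
            self.high_byte = None  # between lines: restart the byte pairing


def vga_pixel(frame_buffer, x, y):
    """Pixel the display side would fetch at (x, y); 0 outside the stored frame."""
    if 0 <= x < frame_buffer.width and 0 <= y < frame_buffer.height:
        return frame_buffer.read(y * frame_buffer.width + x)
    return 0


# Usage: capture a tiny 2x2 frame and read one pixel back on the display side.
buffer = FrameBuffer(2, 2)
capture = CaptureModel(buffer)
capture.clock(vsync=1, href=0, data=0)
for line in ([0xF8, 0x00, 0x07, 0xE0], [0x00, 0x1F, 0x12, 0x34]):
    for byte in line:
        capture.clock(vsync=0, href=1, data=byte)
    capture.clock(vsync=0, href=0, data=0)
bottom_right = vga_pixel(buffer, 1, 1)
```

Separating the write side (camera clock) from the read side (VGA clock) through a shared buffer lets the two interfaces run at their own rates, which is why a frame buffer sits between the capture and display modules.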







## References

- [1] Lifan Yao, Hao Feng, Yiqun Zhu, Zhiguo Jiang, Danpei Zhao and Wenquan Feng, "An architecture of optimised SIFT Feature Detection for an FPGA Implementation of an Image Matcher", IEEE FPT, pp. 30-37, 2009.
- [2] Vanderlei Bonato, Eduardo Marques and George A. Constantinides, "A Parallel Hardware Architecture for Scale and Rotation Invariant Feature Detection", IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 12, pp. 1-11, 2008.
- [3] Ana Brandusa Pavel and Catalin Buiu, "Development of an embedded Artificial Vision System for an Autonomous robot", International Journal of Innovative Computing, Information and Control, vol. 7, no. 2, issue 11, pp. 745-762, February 2011.
- [4] David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, 2004.
- [5] Sudipta N. Sinha, Jan-Michael Frahm, Marc Pollefeys and Yakup Genc, "Feature tracking and matching in Video using programmable graphics hardware", Machine Vision and Applications, Springer-Verlag London Limited, 2007.
- [6] Kosuke Mizuno, Hiroki Noguchi, Guangji He, Yosuke Terachi, Tetsuya Kamino, Hiroshi Kawaguchi and Masahiko Yoshimoto, "Fast and Low memory bandwidth architecture of SIFT descriptor generation with scalability on speed and accuracy for VGA video", IEEE International Conference on Field Programmable Logic and Applications, pp. 608-611, 2010.
- [7] Lakshmana Kumar A. and Dr. R. Ganeshan, "Improved navigation for visually challenged with high authentication using a modified algorithm", International Journal of Advanced Research in Computer Science and Technology, vol. 2, special issue 1, pp. 434-438, 2014.
- [8] Valeriu Codreanu, Feng Dong, Baoquan Liu, Jos B.T.M. Roerdink, David Williams, Po Yang and Burhan Yasar, "GPU-ASIFT: A Fast Fully Affine-Invariant Feature Extraction Algorithm", IEEE, 2013.